Correction: A correlated topic model of Science
Correction to Annals of Applied Statistics 1 (2007) 17--35
[doi:10.1214/07-AOAS114].
Comment: Published at http://dx.doi.org/10.1214/07-AOAS136 in the Annals of
Applied Statistics (http://www.imstat.org/aoas/) by the Institute of
Mathematical Statistics (http://www.imstat.org)
Rodeo: Sparse, greedy nonparametric regression
We present a greedy method for simultaneously performing local bandwidth
selection and variable selection in nonparametric regression. The method starts
with a local linear estimator with large bandwidths, and incrementally
decreases the bandwidth of variables for which the gradient of the estimator
with respect to bandwidth is large. The method--called rodeo (regularization of
derivative expectation operator)--conducts a sequence of hypothesis tests to
threshold derivatives, and is easy to implement. Under certain assumptions on
the regression function and sampling density, it is shown that the rodeo
applied to local linear smoothing avoids the curse of dimensionality, achieving
near-optimal minimax rates of convergence in the number of relevant variables,
as if these variables were isolated in advance.
Comment: Published at http://dx.doi.org/10.1214/009053607000000811 in the
Annals of Statistics (http://www.imstat.org/aos/) by the Institute of
Mathematical Statistics (http://www.imstat.org)
High-dimensional Ising model selection using $\ell_1$-regularized logistic regression
We consider the problem of estimating the graph associated with a binary
Ising Markov random field. We describe a method based on $\ell_1$-regularized
logistic regression, in which the neighborhood of any given node is estimated
by performing logistic regression subject to an $\ell_1$-constraint. The method
is analyzed under high-dimensional scaling in which both the number of nodes $p$
and maximum neighborhood size $d$ are allowed to grow as a function of the
number of observations $n$. Our main results provide sufficient conditions on
the triple $(n,p,d)$ and the model parameters for the method to succeed in
consistently estimating the neighborhood of every node in the graph
simultaneously. With coherence conditions imposed on the population Fisher
information matrix, we prove that consistent neighborhood selection can be
obtained for sample sizes $n=\Omega(d^3\log p)$ with exponentially decaying
error. When these same conditions are imposed directly on the sample matrices,
we show that a reduced sample size of $n=\Omega(d^2\log p)$ suffices for the
method to estimate neighborhoods consistently. Although this paper focuses on
binary graphical models, we indicate how a generalization of the method of
the paper would apply to general discrete Markov random fields.
Comment: Published at http://dx.doi.org/10.1214/09-AOS691 in the Annals of
Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical
Statistics (http://www.imstat.org)